AITopics | homography estimation

homography estimation

Fine-Grained Cross-View Geo-Localization Using a Correlation-Aware Homography Estimator

Neural Information Processing Systems, Apr-25-2026, 00:09:46 GMT

In this paper, we introduce a novel approach to fine-grained cross-view geolocalization.

artificial intelligence, machine learning, survey article, (15 more...)

Neural Information Processing Systems

Country:

North America > United States (0.68)
Asia (0.46)

Genre:

Research Report > Promising Solution (0.48)
Overview > Innovation (0.34)

Industry: Information Technology (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (0.70)
Information Technology > Geographic Information Systems (0.68)
(2 more...)

Unsupervised Homography Estimation on Multimodal Image Pair via Alternating Optimization

Neural Information Processing Systems, Feb-15-2026, 18:53:25 GMT

In response, unsupervised learning approaches have emerged.

artificial intelligence, deep learning, machine learning, (16 more...)

Neural Information Processing Systems

Country:

Asia > South Korea > Seoul > Seoul (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
(6 more...)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry: Information Technology (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

70d4ef44dc973586cfa3ea92b4868b72-Paper-Conference.pdf

Neural Information Processing Systems, Oct-10-2025, 05:45:18 GMT

dataset, experiment, image pair, (14 more...)

Neural Information Processing Systems

Country:

Asia > South Korea > Seoul > Seoul (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
(6 more...)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry: Information Technology (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Auto-regressive transformation for image alignment

Lee, Kanggeon, Lee, Soochahn, Lee, Kyoung Mu

arXiv.org Artificial Intelligence, May-9-2025

Robustness to these challenges improves through iterative refinement of the transformation field while focusing on critical regions in multi-scale image representations. We thus propose Auto-Regressive Transformation (ART), a novel method that iteratively estimates the coarse-to-fine transformations within an auto-regressive framework. Leveraging hierarchical multi-scale features, our network refines the transformations using randomly sampled points at each scale. By incorporating guidance from the cross-attention layer, the model focuses on critical regions, ensuring accurate alignment even in challenging, feature-limited conditions. Extensive experiments across diverse datasets demonstrate that ART significantly outperforms state-of-the-art methods, establishing it as a powerful new method for precise image alignment with broad applicability. Image alignment is a fundamental problem in computer vision that involves registering images captured from different perspectives, times, or modalities by estimating an optimal spatial transformation. The process is essential for achieving seamless integration and analysis of images.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2505.04864

Country:

Asia > South Korea > Seoul > Seoul (0.05)
North America > United States (0.04)
Europe > Switzerland (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)

Genre: Research Report > Promising Solution (0.54)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.30)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Unsupervised Homography Estimation on Multimodal Image Pair via Alternating Optimization

Song, Sanghyeob, Lew, Jaihyun, Jang, Hyemi, Yoon, Sungroh

arXiv.org Artificial Intelligence, Nov-19-2024

Estimating the homography between two images is crucial for mid- or high-level vision tasks, such as image stitching and fusion. However, using supervised learning methods is often challenging or costly due to the difficulty of collecting ground-truth data. In response, unsupervised learning approaches have emerged. Most early methods, though, assume that the given image pairs are from the same camera or have minor lighting differences. Consequently, while these methods perform effectively under such conditions, they generally fail when input image pairs come from different domains, referred to as multimodal image pairs. To address these limitations, we propose AltO, an unsupervised learning framework for estimating homography in multimodal image pairs. Our method employs a two-phase alternating optimization framework, similar to Expectation-Maximization (EM), where one phase reduces the geometry gap and the other addresses the modality gap. To handle these gaps, we use Barlow Twins loss for the modality gap and propose an extended version, Geometry Barlow Twins, for the geometry gap. As a result, we demonstrate that our method, AltO, can be trained on multimodal datasets without any ground-truth data. It not only outperforms other unsupervised methods but is also compatible with various architectures of homography estimators. The source code can be found at: https://github.com/songsang7/AltO
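For readers unfamiliar with the Barlow Twins loss named in the abstract above, the following is a minimal NumPy sketch of the standard formulation: standardize two batches of embeddings per feature, form their cross-correlation matrix, and penalize its distance from the identity. It is an illustration only, not AltO's implementation, and it does not cover the Geometry Barlow Twins extension. The function name `barlow_twins_loss`, the parameter `lambda_offdiag`, and its value are assumptions chosen for this example.

```python
import numpy as np

def barlow_twins_loss(z_a, z_b, lambda_offdiag=5e-3, eps=1e-12):
    """Barlow Twins style loss for two (batch, dim) embedding arrays.

    lambda_offdiag is an illustrative weight (an assumption, not a value
    taken from the AltO abstract).
    """
    n = z_a.shape[0]
    # Standardize every feature dimension over the batch.
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + eps)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + eps)
    # Cross-correlation matrix between the two views, shape (dim, dim).
    c = z_a.T @ z_b / n
    diag = np.diag(c)
    # Invariance term: matching features should correlate perfectly.
    on_diag = np.sum((diag - 1.0) ** 2)
    # Redundancy-reduction term: different features should be decorrelated.
    off_diag = np.sum(c ** 2) - np.sum(diag ** 2)
    return float(on_diag + lambda_offdiag * off_diag)

rng = np.random.default_rng(0)
z = rng.normal(size=(256, 8))
loss_same = barlow_twins_loss(z, z)                            # identical views
loss_diff = barlow_twins_loss(z, rng.normal(size=(256, 8)))    # unrelated views
```

Identical views drive the diagonal of the cross-correlation matrix to one, so `loss_same` is much smaller than `loss_diff`.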

artificial intelligence, image pair, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2411.13036

Country:

Asia > South Korea > Seoul > Seoul (0.05)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
(7 more...)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Efficient Depth Estimation for Unstable Stereo Camera Systems on AR Glasses

Liu, Yongfan, Kwon, Hyoukjun

arXiv.org Artificial Intelligence, Nov-15-2024

Stereo depth estimation is a fundamental component in augmented reality (AR) applications. Although AR applications require very low latency for real-time operation, traditional depth estimation models often rely on time-consuming preprocessing steps such as rectification to achieve high accuracy. In addition, algorithms based on non-standard ML operators, such as cost volume, incur significant latency, which is aggravated on compute-resource-constrained mobile platforms. Therefore, we develop hardware-friendly alternatives to the costly cost volume and preprocessing and design two new models based on them, MultiHeadDepth and HomoDepth. Our approach to cost volume replaces it with a new group-pointwise convolution-based operator and an approximation of cosine similarity based on layernorm and dot product. For online stereo rectification (preprocessing), we introduce a homography matrix prediction network with a rectification positional encoding (RPE), which delivers both low latency and robustness to unrectified images and eliminates the need for preprocessing. Our MultiHeadDepth, which includes optimized cost volume, provides 11.8-30.3% improvements in accuracy and 22.9-25.2% reduction in latency compared to a state-of-the-art depth estimation model for AR glasses from industry. Our HomoDepth, which includes optimized preprocessing (Homography + RPE) upon MultiHeadDepth, can process unrectified images and reduce the end-to-end latency by 44.5%. We adopt a multi-task learning framework to handle misaligned stereo inputs on HomoDepth, which reduces the AbsRel error by 10.0-24.3%. The results demonstrate the efficacy of our approaches in achieving both high model performance and low latency, which makes a step forward toward practical depth estimation on future AR devices.
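The abstract above mentions approximating cosine similarity with layernorm and a dot product but does not give the operator. The sketch below shows one well-known identity that such an approximation could build on: the dot product of two layer-normalized vectors, divided by the feature dimension, equals the cosine similarity of the mean-centered vectors (up to the small `eps` term). This is an assumption about the general idea, not the paper's actual operator; all function names are hypothetical.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Layer normalization over the last axis, without learned scale or shift."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def layernorm_dot_similarity(a, b):
    """Dot product of layer-normalized vectors, scaled by the dimension."""
    return float(layer_norm(a) @ layer_norm(b) / a.shape[-1])

def centered_cosine(a, b):
    """Reference value: cosine similarity after removing each vector's mean."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
a = rng.normal(size=64)
b = a + 0.5 * rng.normal(size=64)   # a noisy copy of a, so similarity is high

approx = layernorm_dot_similarity(a, b)
exact = centered_cosine(a, b)
```

The two values agree closely. The result matches plain (uncentered) cosine similarity only when the features are already close to zero-mean, which is why it is an approximation in general. The appeal is that layernorm and dot products are standard operators that avoid a per-pair norm computation.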

artificial intelligence, image understanding, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2411.10013

Country:

North America > United States > California > Orange County > Irvine (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Vision > Image Understanding (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

STHN: Deep Homography Estimation for UAV Thermal Geo-localization with Satellite Imagery

Xiao, Jiuhong, Zhang, Ning, Tortei, Daniel, Loianno, Giuseppe

arXiv.org Artificial Intelligence, May-30-2024

Accurate geo-localization of Unmanned Aerial Vehicles (UAVs) is crucial for a variety of outdoor applications including search and rescue operations, power line inspections, and environmental monitoring. The vulnerability of Global Navigation Satellite Systems (GNSS) signals to interference and spoofing necessitates the development of additional robust localization methods for autonomous navigation. Visual Geo-localization (VG), leveraging onboard cameras and reference satellite maps, offers a promising solution for absolute localization. Specifically, Thermal Geo-localization (TG), which relies on image-based matching between thermal imagery and satellite databases, stands out by utilizing infrared cameras for effective night-time localization. However, the efficiency and effectiveness of current TG approaches are hindered by dense sampling on satellite maps and geometric noise in thermal query images. To overcome these challenges, in this paper, we introduce STHN, a novel UAV thermal geo-localization approach that employs a coarse-to-fine deep homography estimation method. This method attains reliable thermal geo-localization within a 512-meter radius of the UAV's last known location even with a challenging 11% overlap between satellite and thermal images, despite the presence of indistinct textures in thermal imagery and self-similar patterns in both spectra. Our research significantly enhances UAV thermal geo-localization performance and robustness against the impact of geometric noise under low-visibility conditions in the wild. The code will be made publicly available.

estimation, homography estimation, thermal image, (16 more...)

arXiv.org Artificial Intelligence

2405.2047

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > United States > New York > Kings County > New York City (0.04)

Genre:

Research Report > New Finding (0.68)
Research Report > Promising Solution (0.48)

Industry:

Energy > Power Industry (0.66)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.50)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.34)

No Bells, Just Whistles: Sports Field Registration by Leveraging Geometric Properties

Gutiérrez-Pérez, Marc, Agudo, Antonio

arXiv.org Artificial Intelligence, Apr-12-2024

Broadcast sports field registration is traditionally addressed as a homography estimation task, mapping the visible image area to a planar field model, predominantly focusing on the main camera shot. Addressing the shortcomings of previous approaches, we propose a novel calibration pipeline enabling camera calibration using a 3D soccer field model and extending the process to assess the multiple-view nature of broadcast videos. Our approach begins with a keypoint generation pipeline derived from SoccerNet dataset annotations, leveraging the geometric properties of the court. Subsequently, we execute classical camera calibration through the DLT algorithm in a minimalist fashion, without further refinement. Through extensive experimentation on real-world soccer broadcast datasets such as SoccerNet-Calibration, WorldCup 2014 and TS-WorldCup, our method demonstrates superior performance in both multiple- and single-view 3D camera calibration while maintaining competitive results in homography estimation compared to state-of-the-art techniques.
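As background for the DLT algorithm mentioned in the abstract above, here is a minimal NumPy sketch of the Direct Linear Transform in its planar form: estimating a 3x3 homography from point correspondences. Note that the paper applies DLT to camera calibration against a 3D field model, so this planar variant is a related illustration rather than the authors' pipeline. The function names and the sample points are hypothetical.

```python
import numpy as np

def estimate_homography_dlt(src, dst):
    """Estimate H (3x3) with dst ~ H @ src from N >= 4 point pairs.

    Each correspondence (x, y) -> (u, v) contributes two linear equations
    in the nine entries of H; the solution is the null vector of the
    stacked system, taken from the SVD.
    """
    rows = []
    for (x, y), (u, v) in zip(np.asarray(src, float), np.asarray(dst, float)):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.array(rows))
    h = vt[-1].reshape(3, 3)   # right singular vector of the smallest singular value
    return h / h[2, 2]         # fix the scale; assumes h[2, 2] is not zero

def apply_homography(h, pts):
    """Map (N, 2) points through H using homogeneous coordinates."""
    pts = np.asarray(pts, float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ h.T
    return mapped[:, :2] / mapped[:, 2:3]

# Synthetic check: recover a known homography from noise-free points.
h_true = np.array([[1.2, 0.1, 5.0],
                   [-0.05, 0.9, -3.0],
                   [1e-3, 2e-3, 1.0]])
src_pts = np.array([[0, 0], [100, 0], [100, 100], [0, 100], [50, 30]], float)
dst_pts = apply_homography(h_true, src_pts)
h_est = estimate_homography_dlt(src_pts, dst_pts)
```

With noisy real-world keypoints, it is common to normalize the coordinates before building the system and to follow DLT with a robust or nonlinear refinement step. The abstract states that the authors deliberately skip further refinement.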

camera calibration, dataset, registration, (14 more...)

arXiv.org Artificial Intelligence

2404.08401

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Spain (0.04)

Genre: Research Report > Promising Solution (0.49)

Industry: Leisure & Entertainment > Sports > Soccer (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.60)

Are Semi-Dense Detector-Free Methods Good at Matching Local Features?

Vilain, Matthieu, Giraud, Rémi, Germain, Hugo, Bourmaud, Guillaume

arXiv.org Artificial Intelligence, Feb-13-2024

Semi-dense detector-free approaches (SDF), such as LoFTR, are currently among the most popular image matching methods. While SDF methods are trained to establish correspondences between two images, their performances are almost exclusively evaluated using relative pose estimation metrics. Thus, the link between their ability to establish correspondences and the quality of the resulting estimated pose has thus far received little attention. This paper is a first attempt to study this link. We start with proposing a novel structured attention-based image matching architecture (SAM). It allows us to show a counter-intuitive result on two datasets (MegaDepth and HPatches): on the one hand SAM either outperforms or is on par with SDF methods in terms of pose/homography estimation metrics, but on the other hand SDF approaches are significantly better than SAM in terms of matching accuracy. We then propose to limit the computation of the matching accuracy to textured regions, and show that in this case SAM often surpasses SDF methods. Our findings highlight a strong correlation between the ability to establish accurate correspondences in textured regions and the accuracy of the resulting estimated pose/homography. Our code will be made available.

correspondence, correspondent, sdf method, (14 more...)

arXiv.org Artificial Intelligence

2402.08671

Country: Europe > France (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Novel OCT mosaicking pipeline with Feature- and Pixel-based registration

Wang, Jiacheng, Li, Hao, Hu, Dewei, Tao, Yuankai K., Oguz, Ipek

arXiv.org Artificial Intelligence, Nov-21-2023

High-resolution Optical Coherence Tomography (OCT) images are crucial for ophthalmology studies but are limited by their relatively narrow field of view (FoV). Image mosaicking is a technique for aligning multiple overlapping images to obtain a larger FoV. Current mosaicking pipelines often struggle with substantial noise and considerable displacement between the input sub-fields. In this paper, we propose a versatile pipeline for stitching multi-view OCT/OCTA en face projection images. Our method combines the strengths of learning-based feature matching and robust pixel-based registration to align multiple images effectively. Furthermore, we advance the application of a trained foundational model, Segment Anything Model (SAM), to validate mosaicking results in an unsupervised manner. The efficacy of our pipeline is validated using an in-house dataset and a large public dataset, where our method shows superior performance in terms of both accuracy and computational efficiency. We also made our evaluation tool for image mosaicking and the corresponding pipeline publicly available at https://github.com/MedICL-VU/OCT-mosaicking.

dataset, registration, segmentation, (16 more...)

arXiv.org Artificial Intelligence

2311.13052

Genre: Research Report (1.00)

Industry:

Health & Medicine > Diagnostic Medicine (0.70)
Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)
